robotic arm


e7663e974c4ee7a2b475a4775201ce1f-Supplemental-Conference.pdf

Neural Information Processing Systems

The key challenge in making this connection is grounding the skills, so that each skill corresponds to a specific goal-conditioned policy. We start by recalling the definition of the discounted state occupancy measure (Eq. 3): $p(s_{t+} = s_g) = (1-\gamma)\sum_{t=1}^{\infty} \gamma^{t-1}\, p(s_t = s_g)$. On the second line, we changed the bounds of the summation to start at 0, and changed the terms inside the summation accordingly. On the third line, we applied linearity of expectation to move the summation inside the expectation. On the fourth line, we applied linearity of expectation again to move the term for $t = 0$ inside the expectation. Finally, we substituted the definition of $r_g(s, a)$ to obtain the desired result. This result means that we are doing policy improvement with approximate Q-values.
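The four steps referenced above can be sketched as follows. The exact form of Eq. 3 and the definition $r_g(s, a) = (1-\gamma)\,\mathbb{E}[\mathbb{1}(s_{t+1} = s_g) \mid s_t = s, a_t = a]$ are assumptions reconstructed from the surrounding text, not quoted from the paper:

\begin{align*}
p(s_{t+} = s_g)
&= (1-\gamma)\sum_{t=1}^{\infty}\gamma^{t-1}\, p(s_t = s_g) && \text{(Eq. 3)}\\
&= (1-\gamma)\sum_{t=0}^{\infty}\gamma^{t}\, p(s_{t+1} = s_g) && \text{(re-index to start at } t = 0\text{)}\\
&= \mathbb{E}\!\left[\sum_{t=0}^{\infty}\gamma^{t}\,(1-\gamma)\,\mathbb{1}(s_{t+1} = s_g)\right] && \text{(linearity of expectation)}\\
&= \mathbb{E}\!\left[(1-\gamma)\,\mathbb{1}(s_{1} = s_g) + \sum_{t=1}^{\infty}\gamma^{t}\,(1-\gamma)\,\mathbb{1}(s_{t+1} = s_g)\right] && \text{(separate the } t = 0 \text{ term)}\\
&= \mathbb{E}\!\left[r_g(s_0, a_0) + \sum_{t=1}^{\infty}\gamma^{t}\,(1-\gamma)\,\mathbb{1}(s_{t+1} = s_g)\right] && \text{(definition of } r_g\text{)}
\end{align*}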



Haptic-based Complementary Filter for Rigid Body Rotations

Kumar, Amit, Campolo, Domenico, Banavar, Ravi N.

arXiv.org Artificial Intelligence

The non-commutative nature of 3D rotations poses well-known challenges in generalizing planar problems to three-dimensional ones, even more so in contact-rich tasks where haptic information (i.e., forces/torques) is involved. In this sense, not all learning-based algorithms that are currently available generalize to 3D orientation estimation. Non-linear filters defined on SO(3) are widely used with inertial measurement sensors; however, none of them have been used with haptic measurements. This paper presents a unique complementary filtering framework that interprets the geometric shape of objects in the form of superquadrics, exploits the symmetry of SO(3), and uses force and vision sensors as measurements to provide an estimate of orientation. The framework's robustness and almost-global stability are substantiated by a set of experiments on a dual-arm robotic setup.
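To illustrate the complementary-filtering principle the paper builds on, here is a minimal planar (single-angle) sketch: high-pass the rate-propagated estimate, low-pass the absolute measurement. The paper's filter operates on SO(3) and fuses force/vision measurements; this 1-DoF version and its gain `alpha` are illustrative assumptions only.

```python
import math

def complementary_filter(theta_est, omega, theta_meas, dt, alpha=0.98):
    """One step of a complementary filter on a single rotation angle.

    Blends the integrated angular rate (trusted at high frequency) with
    an absolute angle measurement (trusted at low frequency).
    """
    propagated = theta_est + omega * dt            # integrate angular rate
    blended = alpha * propagated + (1 - alpha) * theta_meas
    # wrap the result back into (-pi, pi]
    return math.atan2(math.sin(blended), math.cos(blended))

# With zero angular rate, the estimate converges to the measurement.
theta = 0.0
for _ in range(400):
    theta = complementary_filter(theta, omega=0.0, theta_meas=1.0, dt=0.01)
```

On SO(3) the additive blend is replaced by composition of rotations and an interpolation respecting the group structure, which is where the symmetry arguments in the paper come in.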


See-Control: A Multimodal Agent Framework for Smartphone Interaction with a Robotic Arm

Zhao, Haoyu, Ding, Weizhong, Yang, Yuhao, Tian, Zheng, Yang, Linyi, Shao, Kun, Wang, Jun

arXiv.org Artificial Intelligence

Recent advances in Multimodal Large Language Models (MLLMs) have enabled their use as intelligent agents for smartphone operation. However, existing methods depend on the Android Debug Bridge (ADB) for data transmission and action execution, limiting their applicability to Android devices. In this work, we introduce the novel Embodied Smartphone Operation (ESO) task and present See-Control, a framework that enables smartphone operation via direct physical interaction with a low-DoF robotic arm, offering a platform-agnostic solution. See-Control comprises three key components: (1) an ESO benchmark with 155 tasks and corresponding evaluation metrics; (2) an MLLM-based embodied agent that generates robotic control commands without requiring ADB or system back-end access; and (3) a richly annotated dataset of operation episodes, offering valuable resources for future research. By bridging the gap between digital agents and the physical world, See-Control provides a concrete step toward enabling home robots to perform smartphone-dependent tasks in realistic environments.


Closed-Loop Robotic Manipulation of Transparent Substrates for Self-Driving Laboratories using Deep Learning Micro-Error Correction

Fontenot, Kelsey, Gorti, Anjali, Goel, Iva, Buonassisi, Tonio, Siemenn, Alexander E.

arXiv.org Artificial Intelligence

Self-driving laboratories (SDLs) have accelerated the throughput and automation capabilities for discovering and improving chemistries and materials. Although these SDLs have automated many of the steps required to conduct chemical and materials experiments, a commonly overlooked step in the automation pipeline is the handling and reloading of substrates used to transfer or deposit materials onto for downstream characterization. Here, we develop a closed-loop method of Automated Substrate Handling and Exchange (ASHE) using robotics, dual-actuated dispensers, and deep learning-driven computer vision to detect and correct errors in the manipulation of fragile and transparent substrates for SDLs. Using ASHE, we demonstrate a 98.5% first-time placement accuracy across 130 independent trials of reloading transparent glass substrates into an SDL, where only two substrate misplacements occurred and were successfully detected as errors and automatically corrected. Through the development of more accurate and reliable methods for handling various types of substrates, we move toward an improvement in the automation capabilities of self-driving laboratories, furthering the acceleration of novel chemical and materials discoveries.
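The detect-and-correct loop described for ASHE can be sketched generically as follows; `place`, `detect_error`, and `correct` stand in for the robot motion, the deep-learning vision check, and the recovery routine, and are hypothetical callables, not the paper's API.

```python
def place_substrate(place, detect_error, correct, max_retries=3):
    """Closed-loop substrate placement: attempt, verify with vision,
    and run a micro-error correction whenever a misplacement is detected.

    Returns True once the placement passes the vision check, or the
    result of a final check after exhausting retries.
    """
    place()                          # initial placement attempt
    for _ in range(max_retries):
        if not detect_error():
            return True              # placement verified
        correct()                    # micro-error correction, then re-check
    return not detect_error()
```

The loop mirrors the reported behavior: most placements succeed on the first check, and the rare misplacements are caught and corrected automatically.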


Human-Robot Collaboration for the Remote Control of Mobile Humanoid Robots with Torso-Arm Coordination

Boguslavskii, Nikita, Genua, Lorena Maria, Li, Zhi

arXiv.org Artificial Intelligence

Recently, humanoid robots have been increasingly deployed in facilities such as hospitals and assisted living environments, where they are often remotely controlled by human operators. Their kinematic redundancy enhances reachability and manipulability, enabling them to navigate complex, cluttered environments and perform a wide range of tasks. However, this redundancy also presents significant control challenges, particularly in coordinating the movements of the robot's macro-micro structure (torso and arms). We therefore propose several human-robot collaborative (HRC) methods for coordinating the torso and arm of remotely controlled mobile humanoid robots, aiming to balance autonomy and human input to enhance system efficiency and task execution. The proposed methods include human-initiated approaches, where users manually control torso movements, and robot-initiated approaches, which autonomously coordinate the torso and arm based on factors such as reachability, the task goal, or inferred human intent. We conducted a user study with N=17 participants to compare the proposed approaches in terms of task performance, manipulability, and energy efficiency, and analyzed which methods participants preferred. HRC control enables human input and robot autonomy to complement each other, improving overall robotic manipulation performance.
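The spectrum between human-initiated and robot-initiated coordination can be illustrated with a simple shared-control arbitration: a per-joint linear blend of the operator's command and the autonomous policy's command. This blend and its weights are an illustrative baseline, not one of the paper's specific methods.

```python
def blend_torso_command(human_cmd, auto_cmd, autonomy_weight):
    """Blend a human torso command with an autonomous one, per joint.

    autonomy_weight = 0.0 is fully human-initiated control;
    autonomy_weight = 1.0 is fully robot-initiated coordination.
    """
    if not 0.0 <= autonomy_weight <= 1.0:
        raise ValueError("autonomy_weight must be in [0, 1]")
    return [
        autonomy_weight * a + (1.0 - autonomy_weight) * h
        for h, a in zip(human_cmd, auto_cmd)
    ]

# Endpoints of the spectrum for a two-joint torso command:
manual = blend_torso_command([0.2, -0.1], [0.5, 0.3], 0.0)
auto = blend_torso_command([0.2, -0.1], [0.5, 0.3], 1.0)
```

In practice the weight could itself be driven by the factors the paper lists, such as reachability margins or inferred human intent.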


Translating Cultural Choreography from Humanoid Forms to Robotic Arm

Chen, Chelsea-Xi, Zhang, Zhe, Zhou, Aven-Le

arXiv.org Artificial Intelligence

Robotic arm choreography often reproduces trajectories while missing cultural semantics. This study examines whether symbolic posture transfer with joint-space-compatible notation can preserve semantic fidelity on a six-degree-of-freedom arm and remain portable across morphologies. We implement ROPERA, a three-stage pipeline for encoding culturally codified postures, composing symbolic sequences, and decoding to servo commands. A scene from the Kunqu opera "The Peony Pavilion" serves as the material for evaluation. The procedure includes corpus-based posture selection, symbolic scoring, direct joint angle execution, and a visual layer with light painting and costume-informed colors. Results indicate reproducible execution with intended timing and cultural legibility reported by experts and audiences. The study points to non-anthropocentric cultural preservation and portable authoring workflows. Future work will design dance-informed transition profiles, extend the notation to locomotion with haptic, musical, and spatial cues, and test portability across platforms.
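The decoding stage of such a pipeline can be sketched as a lookup from symbolic posture labels to joint-space targets for a six-DoF arm. The labels and angles below are invented placeholders, not the paper's actual notation or Kunqu postures.

```python
# Toy posture library: symbolic label -> six joint angles in degrees.
POSTURE_LIBRARY = {
    "posture_A": [0.0, -45.0, 90.0, 0.0, 45.0, 0.0],
    "posture_B": [30.0, -30.0, 60.0, 15.0, 30.0, -10.0],
}

def decode_sequence(symbols):
    """Decode a symbolic posture score into per-joint angle commands,
    one six-element command per symbol, in score order."""
    try:
        return [POSTURE_LIBRARY[s] for s in symbols]
    except KeyError as e:
        raise ValueError(f"unknown posture symbol: {e.args[0]}") from None

# A short symbolic score decoded to servo-ready joint targets:
commands = decode_sequence(["posture_A", "posture_B", "posture_A"])
```

Portability across morphologies then reduces to swapping the library (the symbol-to-joint-space mapping) while keeping the symbolic score unchanged.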


The Download: AI-powered warfare, and how embryo care is changing

MIT Technology Review

Plus: why other industries are keeping such a close eye on Big Tech's job cuts.

It is July 2027, and China is on the brink of invading Taiwan. Autonomous drones with AI targeting capabilities are primed to overpower the island's air defenses as a series of crippling AI-generated cyberattacks cut off energy supplies and key communications. In the meantime, a vast disinformation campaign enacted by an AI-powered pro-Chinese meme farm spreads across global social media, deadening the outcry at Beijing's act of aggression. Scenarios such as this have brought dystopian horror to the debate about the use of AI in warfare. Military commanders hope for a digitally enhanced force that is faster and more accurate than human-directed combat. But there are fears that as AI assumes an increasingly central role, these same commanders will lose control of a conflict that escalates too quickly and lacks ethical or legal oversight.


AI-Driven Robotics for Optics

Uddin, Shiekh Zia, Vaidya, Sachin, Choudhary, Shrish, Chen, Zhuo, Salib, Raafat K., Huang, Luke, Englund, Dirk R., Soljačić, Marin

arXiv.org Artificial Intelligence

Optics is foundational to research in many areas of science and engineering, including nanophotonics, quantum information, materials science, biomedical imaging, and metrology. However, the design, assembly, and alignment of optical experiments remain predominantly manual, limiting throughput and reproducibility. Automating such experiments is challenging due to the strict, non-negotiable precision requirements and the diversity of optical configurations found in typical laboratories. Here, we introduce a platform that integrates generative artificial intelligence, computer vision, and robotics to automate free-space optical experiments. The platform translates user-defined goals into valid optical configurations, assembles them using a robotic arm, and performs micrometer-scale fine alignment using a robot-deployable tool. It then executes a range of automated measurements, including beam characterization, polarization mapping, and spectroscopy, with consistency surpassing that of human operators. This work demonstrates the first flexible, AI-driven automation platform for optics, offering a path towards remote operation, cloud labs, and high-throughput discovery in the optical sciences.


MLM: Learning Multi-task Loco-Manipulation Whole-Body Control for Quadruped Robot with Arm

Liu, Xin, Ma, Bida, Qi, Chenkun, Ding, Yan, Xu, Nuo, Zhaxizhuoma, Zhang, Guorong, Chen, Pengan, Liu, Kehui, Jia, Zhongjie, Guan, Chuyue, Mo, Yule, Liu, Jiaqi, Gao, Feng, Zhong, Jiangwei, Zhao, Bin, Li, Xuelong

arXiv.org Artificial Intelligence

Whole-body loco-manipulation for quadruped robots with arms remains a challenging problem, particularly in achieving multi-task control. To address this, we propose MLM, a reinforcement learning framework driven by both real-world and simulation data. It enables a six-DoF robotic arm-equipped quadruped robot to perform whole-body loco-manipulation for multiple tasks autonomously or under human teleoperation. To address the problem of balancing multiple tasks during the learning of loco-manipulation, we introduce a trajectory library with an adaptive, curriculum-based sampling mechanism. This approach allows the policy to efficiently leverage real-world collected trajectories for learning multi-task loco-manipulation. To address deployment scenarios with only historical observations and to enhance the performance of policy execution across tasks with different spatial ranges, we propose a Trajectory-Velocity Prediction policy network. It predicts unobservable future trajectories and velocities. By leveraging extensive simulation data and curriculum-based rewards, our controller achieves whole-body behaviors in simulation and zero-shot transfer to real-world deployment. Ablation studies in simulation verify the necessity and effectiveness of our approach, while real-world experiments on a Go2 robot with an Airbot robotic arm demonstrate the policy's good performance in multi-task execution.
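The adaptive, curriculum-based sampling over a trajectory library can be sketched as weighting each task by how poorly the policy currently performs on it. The specific weighting below (1 minus success rate, sharpened by a temperature, with a small floor so mastered tasks are still revisited) is an illustrative assumption, not MLM's exact mechanism.

```python
import random

def sample_task(tasks, success_rates, temperature=1.0, rng=random):
    """Draw one task from the library, favoring low-success tasks.

    tasks: list of task names in the trajectory library.
    success_rates: dict mapping task name -> current success rate in [0, 1].
    temperature: < 1 sharpens the curriculum, > 1 flattens it.
    """
    weights = [max(1.0 - success_rates[t], 1e-3) ** (1.0 / temperature)
               for t in tasks]
    return rng.choices(tasks, weights=weights, k=1)[0]

# A task with a 10% success rate is sampled far more often than one at 90%.
rng = random.Random(0)
tasks = ["pick", "open_door"]
rates = {"pick": 0.9, "open_door": 0.1}
```

As the policy improves on a task, its weight decays and the sampler automatically shifts effort toward the tasks that still fail, which is the curriculum effect the paper relies on.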